Stemming Arabic Conjunctions and Prepositions

Authors

  • Abdusalam F. A. Nwesri
  • Seyed M. M. Tahaghoghi
  • Falk Scholer
Abstract

Arabic is the fourth most widely spoken language in the world, and is characterised by a high rate of inflection. To cater for this, most Arabic information retrieval systems incorporate a stemming stage. Most existing Arabic stemmers are derived from English equivalents; however, unlike in English, most affixes in Arabic are difficult to distinguish from the core word. Removing incorrectly identified affixes sometimes results in a valid but incorrect stem, and in most cases reduces retrieval precision. Conjunctions and prepositions form an interesting class of these affixes. In this work, we present novel approaches for dealing with them. Unlike previous approaches, ours focus on retaining valid Arabic core words while maintaining high retrieval performance.
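To make the problem described in the abstract concrete, here is a minimal sketch. It is not the authors' method, only an illustration of why blindly removing single-letter conjunction and preposition prefixes can damage core words. The tiny lexicon and the function names `naive_strip` and `checked_strip` are assumptions introduced for this example.

```python
# Single-letter Arabic proclitics: conjunctions (wa, fa) and prepositions (bi, li, ka).
PREFIXES = ("و", "ف", "ب", "ل", "ك")

# Hypothetical toy lexicon of valid core words: walad (boy), bayt (house), kitab (book).
# A real system would consult a large dictionary or corpus-derived word list.
LEXICON = {"ولد", "بيت", "كتاب"}


def naive_strip(word: str) -> str:
    """Remove a leading prefix letter unconditionally, as long as two letters remain."""
    if len(word) > 2 and word.startswith(PREFIXES):
        return word[1:]
    return word


def checked_strip(word: str, lexicon: set = LEXICON) -> str:
    """Remove a leading prefix letter only when doing so yields a known core word.

    A word that is itself in the lexicon is returned unchanged, so a core word
    that merely begins with one of the prefix letters is not truncated.
    """
    if word in lexicon:
        return word
    if len(word) > 2 and word.startswith(PREFIXES) and word[1:] in lexicon:
        return word[1:]
    return word


if __name__ == "__main__":
    for w in ["ولد", "بيت", "وكتاب"]:
        print(w, naive_strip(w), checked_strip(w))
```

In this sketch the naive version truncates the core words ولد and بيت, while the checked version leaves them intact and still reduces وكتاب ("and a book") to كتاب. A lexicon check alone does not resolve true ambiguity, where both the full word and the stripped remainder are valid words.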

Similar resources

Attacking Parsing Bottlenecks with Unlabeled Data and Relevant Factorizations

Prepositions and conjunctions are two of the largest remaining bottlenecks in parsing. Across various existing parsers, these two categories have the lowest accuracies, and mistakes made have consequences for downstream applications. Prepositions and conjunctions are often assumed to depend on lexical dependencies for correct resolution. As lexical statistics based on the training set only are ...

Prepositions and Conjunctions in a Natural Language Interfaces to Databases

This paper presents the treatment of prepositions and conjunctions in natural language interfaces to databases (NLIDB) that allows better translation of queries expressed in natural language into formal languages. Prepositions and conjunctions weren't sufficiently studied for their usage in NLIDBs, because most of the NLIDBs just look for keywords in the sentences and focus their analysis on nouns and verbs getting ...

DeQue: A Lexicon of Complex Prepositions and Conjunctions in French

We introduce DeQue, a lexicon covering French complex prepositions (CPRE) like à partir de (from) and complex conjunctions (CCONJ) like bien que (although). The lexicon includes fine-grained linguistic description based on empirical evidence. We describe the general characteristics of CPRE and CCONJ in French, with special focus on syntactic ambiguity. Then, we list the selection criteria used ...

Buckwalter-based Lookup Tool as Language Resource for Arabic Language Learners

The morphology of the Arabic language is rich and complex; words are inflected to express variations in tense-aspect, person, number, and gender, while they may also appear with clitics attached to express possession on nouns, objects on verbs and prepositions, and conjunctions. Furthermore, Arabic script allows the omission of short vowel diacritics. For the Arabic language learner trying to u...

Temporal Prepositions and Their Logic

This paper investigates the computational complexity of reasoning with English sentences featuring temporal prepositions, temporal subordinating conjunctions and the order-denoting adjectives ‘first’ and ‘last’. A fragment of English featuring these constructions, called TPE, is defined by means of a context-free grammar. The phrase structures which this grammar assigns to the sentences it recog...

Publication date: 2005